THOMAS R. BURKE (CA State Bar No. 141930)

DAVIS WRIGHT TREMAINE LLP 

505 Montgomery Street, Suite 800 

San Francisco, California 94111 

Telephone: (415) 276-6500

Facsimile: (415) 276-6599

Email: thomasburke@dwt.com 

RONALD G. LONDON (Pro Hac Vice) 

DAVIS WRIGHT TREMAINE LLP 
1919 Pennsylvania Ave., N.W., Suite 800 
Washington, DC 20006 
Telephone: (202) 973-4200 

Email: ronnielondon@dwt.com 

DAN LAIDMAN (State Bar No. 274482) 

DAVIS WRIGHT TREMAINE LLP 
865 South Figueroa Street, Suite 2400 
Los Angeles, CA 90017-2566 
Telephone: (213) 633-6800

Facsimile: (213) 633-6899

Email: danlaidman@dwt.com 

DAVID HALPERIN (Pro Hac Vice) 

1530 P Street NW 
Washington, DC 20005 
Telephone: (202) 905-3434 

Email: davidhalperindc@gmail.com 

Attorneys for Plaintiff Public.Resource.Org 


IN THE UNITED STATES DISTRICT COURT 
THE NORTHERN DISTRICT OF CALIFORNIA 
SAN FRANCISCO DIVISION 


PUBLIC.RESOURCE.ORG., a California        )  Case No. 3:13-CV-02789-WHO
non-profit organization,                  )
                                          )  DECLARATION OF CARL MALAMUD
                    Plaintiff,            )
                                          )
          v.                              )
                                          )
UNITED STATES INTERNAL REVENUE            )
SERVICE,                                  )
                                          )
                    Defendant.            )
__________________________________________)

I, Carl Malamud, declare as follows:

1. Since 2007, I have been the President and Founder of Public.Resource.Org, a
nonprofit corporation and the Plaintiff in this FOIA action. I have personal knowledge of the 
matters stated in this declaration and could competently testify to them if called as a witness. 

Mr. Malamud’s Background and Experience 

2. My formal education was in Business Economics and Public Policy at the Indiana 
University School of Business where I completed all coursework for the Doctorate in Business 
Administration and received an MBA in 1982. 

3. From 1982 to 1992, I worked professionally in the field of computer networks,
including positions at the Board of Governors of the Federal Reserve System, numerous 
consulting engagements with government groups such as the Department of Defense, wrote as a 
Contributing Editor and columnist for numerous trade publications such as Communications 
Week, and authored 8 professional reference books. 

4. From 1993 to 1996, I served full-time as the founder and executive director of the
Internet Multicasting Service, where I started and ran the first radio station on the Internet. As 
part of my work at the Internet Multicasting Service, I was also responsible for putting the U.S. 
Securities and Exchange Commission EDGAR system on the Internet and then donating 
computers and software to the SEC so they could take my system over. I was also responsible for 
putting numerous other government databases on the Internet for the first time, including the U.S. 
Patent database. 

5. In 1998 and 1999, I was the CEO of Invisible Worlds. During that period, I
worked with my Chief Technology Officer, Dr. Marshall T. Rose, to help develop the tools used 
to produce Internet Standards. These tools are based on the XML markup language, which is the 
same language that the IRS uses for their Modernized e-File (MeF) format. These tools continue 
to be used as the basis for authoring documents for the Internet standards process. The 
specifications for this work have been published as Internet Request for Comments 2629, “Writing 
I-Ds and RFCs using XML.” That standard may be found at http://tools.ietf.org/html/rfc2629 .

6. In 2004, I was a consultant on documentation strategies to the Internet Systems
Consortium, a nonprofit corporation that produces software essential to the operation of the 
Domain Name System. I was the founding Chairman of the Board of the Internet Systems 
Consortium in 1994. ISC is the author and publisher of BIND, which is used by many large 
Domain Name Servers throughout the world and also operates the “F” Root Name server, which is 
one of the core authoritative name servers that make the Internet function. As a consultant on 
documentation strategies, I spent a great deal of time working with Docbook, an XML-based 
authoring language for technical documentation. 

7. In 2007, I founded Public.Resource.Org, a nonprofit corporation which is based in
California. We are responsible for placing the historical opinions of the U.S. Court of Appeals 
back to the founding of the court on the Internet for the first time. As part of that work we 
discovered numerous Social Security Numbers (SSNs) in those opinions and notified the Court of 
the presence of this information. On July 16, 2008, Chief Judge Lee H. Rosenthal thanked us for 
our efforts on behalf of the Committee on Rules of Practice and Procedure of the Judicial 
Conference of the United States. That letter may be found at 
https://public.resource.org/scribd/7512576.pdf . 

8. In 2008 and 2009, I conducted a series of audits on 20 million pages of PACER
documents and discovered numerous SSNs. We notified the Chief Judges of 32 U.S. District 
Courts of these findings and this resulted in changes in the privacy procedures for the PACER 
documents and acknowledgment of our efforts by several Chief Judges and by the Committee on 
Rules of Practice and Procedure of the Judicial Conference of the United States. 

9. In 2007 and then again in 2010, I submitted reports to the Speaker of the House of
Representatives concerning my recommendations for broader availability of video from 
Congressional hearings. On January 5, 2011, the Speaker of the House acknowledged my efforts 
and authorized me to work with the Committee on Oversight and Government Reform and the 
House Broadcast Studio, an effort that led to the posting of over 14,000 hours of video from 
Congressional hearings. The letter from the Speaker may be found at 
https://law.resource.org/rfcs/gov.house.20110105.pdf .

10. In 2008, I served as an advisor to the Presidential transition, where I outlined a
series of proposed changes in how the Official Journals of Government, including the Federal 
Register, can be published. Those changes were implemented and have resulted in a substantial 
improvement in the online system, which is visible at federalregister.gov. 

11. In 2008, I began a program called FedFlix in cooperation with the National
Technical Information Service (NTIS) and the National Archives and Records Administration.
The program sent volunteers into the National Archives to copy videos and obtained copies of
video from numerous agencies, including the Department of Defense, OSHA, and the Mine Health 
and Safety Administration. Approximately 6,000 videos were copied and posted to YouTube and 
the Internet Archive and have since been viewed over 50 million times. 

Mr. Malamud’s Work with the IRS Exempt Organizations Database 

12. In 2008, I began working with the IRS Exempt Organizations database by
submitting payment for 6 years of DVDs and developing software to process that data and post it 
on the Internet with no restrictions on use. Since 2008, I have processed and posted on the
Internet over 7,634,050 instances of the Form 990 filed by Exempt Organizations. The data that I 
processed was made available on our servers, on nonprofit services such as the Internet Archive, 
and forms the basis for numerous other commercial and non-commercial systems that analyze and 
host Form 990 data. Our archive of Form 990s is the only one freely available on the Internet with
no restrictions on access or use. We make this data available free of charge and with no 
restrictions, just as we have with court documents and numerous other government databases, 
because we believe that these Works of Government should be more broadly available. 

13. As part of my work, I performed audits of the Exempt Organizations database 
looking for instances of where the IRS has released individuals’ SSNs as part of its release of 
Form 990 data. Our best estimate is that there are close to 600,000 SSNs in the Exempt 
Organizations data we purchased from the IRS. When I find SSNs in a Form 990, I redact that
information and replace the files we made available for public view. I also systematically notify 
the IRS, GuideStar, the Foundation Center, and others who I know have copies of this database.

14. On July 2, 2013, I notified the IRS and the Treasury Inspector General for Tax
Administration (TIGTA) of a large number of Social Security Numbers for political organizations 
filing under Section 527 that were on the IRS web site. That notification can be found at 
https://bulk.resource.org/irs.gov/eo/doc/irs.gov.20130702.pdf . The Inspector General assigned 
complaint number 63-1307-0025-C to their investigation of this matter. 

15. On July 15, 2013, Congressman Tom Latham and 41 other members of the House 
of Representatives wrote to the Acting Commissioner of the Internal Revenue Service to request 
an explanation of this privacy breach. That letter may be found at 
https://bulk.resource.org/irs.gov/eo/doc/irs.gov.20130715.pdf . 

16. On September 16, 2013, the Acting Commissioner wrote to Congressman Tom 
Latham and informed the Congress that the IRS had changed the position on redaction of Social 
Security Numbers. That letter may be found at 
https://bulk.resource.org/irs.gov/eo/doc/irs.gov.20130916.pdf. 

17. On December 6, 2013, the Internal Revenue Service updated section 3.20.13.13.2 
of the Internal Revenue Manual to permit redaction of Social Security Numbers for Section 527 
Political Organizations. Those changes were effective January 1, 2014. This section of the IRM 
may be found at http://www.irs.gov/irm/part3/irm_03-020-013r.html .

18. On April 22, 2014, I notified the IRS Commissioner and the Inspector General of a
large number of Social Security Numbers in returns for Exempt Organizations that are not 
Political Organizations. That letter may be found at 
https://bulk.resource.org/irs.gov/eo/doc/irs.gov.20140422.pdf. 

19. On July 7, 2014, I concluded the audit of SSNs and sent the IRS Commissioner and
the Inspector General detailed audit results, including copies of 9,392 returns that I had redacted 
with detailed recommendations on steps the IRS should take to mitigate this problem. The cover 
letter for this audit may be found at https://bulk.resource.org/irs.gov/eo/doc/irs.gov.20140707.pdf. 
The Inspector General assigned complaint number 63-1407-0060-C to their investigation of this 
matter.

20. On July 24, 2014, I notified the IRS of my analysis of the April, 2014 shipment of
returns. In that notification, I informed the IRS of a major privacy breach for an exempt 
organization that had e-filed their results. That notice can be found at 
https://bulk.resource.org/irs.gov/eo/doc/irs.gov.20140707.pdf . 

21. In order to find privacy breaches in Exempt Organization filings, I am forced to use 
Optical Character Recognition. For the April, 2014 results, this required running OCR on 546,631 
pages of returns. I started that process on July 18 and by devoting a 12-CPU system entirely to the 
task, was able to process 177,144 pages per day. The process was completed on July 22. 
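As a rough check on the figures in the preceding paragraph, the elapsed time follows from dividing the page count by the daily throughput. The sketch below is illustrative only and is not the software used for the OCR run; the function name `ocr_days` is hypothetical.

```python
def ocr_days(total_pages: int, pages_per_day: int) -> float:
    """Estimate the days of continuous OCR needed at a fixed daily throughput."""
    return total_pages / pages_per_day


# Figures taken from the paragraph above.
days = ocr_days(546_631, 177_144)
print(f"Estimated OCR time: {days:.2f} days")
```

The result, a little over three days of continuous processing, is consistent with a run that started on July 18 and finished on July 22.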

22. In addition to taking a lot of time, in my considerable experience, using OCR is 
inherently inaccurate. For example, the letter O can easily be confused with the number 0. 
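To illustrate why such character confusions matter, the following sketch searches OCR output for strings shaped like an SSN while tolerating a few letter-for-digit substitutions. It is a minimal example under stated assumptions, not the audit software described in this declaration: the confusion table, the pattern, and the function name `find_ssn_candidates` are all hypothetical, and the sample number is made up.

```python
import re

# Letters that OCR may emit in place of digits (illustrative, not exhaustive).
OCR_DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1"})

# An SSN-shaped token (3-2-4 groups) whose "digits" may be confusable letters.
SSN_LIKE = re.compile(
    r"(?<![0-9A-Za-z])[0-9OoIl]{3}-[0-9OoIl]{2}-[0-9OoIl]{4}(?![0-9A-Za-z])"
)


def find_ssn_candidates(text: str) -> list[str]:
    """Return SSN-shaped strings, with confusable letters mapped back to digits."""
    return [m.group(0).translate(OCR_DIGIT_FIXES) for m in SSN_LIKE.finditer(text)]


# The trailing letter "O" stands in for an OCR misreading of the digit zero.
print(find_ssn_candidates("Officer SSN: 123-45-678O, phone 555-1234"))
```

A search that accepted only digits would miss the misread number entirely, which is one way OCR errors translate into missed privacy breaches.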

Mr. Malamud’s Work with IRS Form 990. 

23. As part of my work on the IRS Exempt Organizations database, I have carefully 
examined the documentation on the Modernized e-File (MeF) format. That information can be
found at http://www.irs.gov/Tax-Professionals/e-File-Providers-&-Partners/Modernized-e-File-Program-Information .

24. I have read and am familiar with the MeF Submission Composition Guide which 
details the structure of an e-file submission, including the XML format for a submitted return, the
“envelope” for that submission in the SOAP format (which is also based on XML), and the rules
for submitting attachments as PDF files. That guide may be found at
http://www.irs.gov/pub/irs-schema/MeF_Submission_Composition_Guide_v1-4.pdf .

25. I have read and am familiar with the Schemas and Business Rules for Exempt 
Organizations, including Forms 990, 990EZ, 990-N, 990-PF, 1120-POL, and 8868 as well as
Corporate Forms 1120, 1120S, and 7004. That information may be found at
http://www.irs.gov/Charities-&-Non-Profits/Current-Valid-XML-Schemas-and-Business-Rules-for-Exempt-Organizations-Modernized-e-File .

26. The IRS does not provide a sample instance of an XML file for the Form 990 or
Form 990-PF. However, I was able to examine a sample instance of an XML file for a corporate
return based on Form 1120. That file is contained in the IRS publication “2014 Valid XML
Schemas and Business Rules for 1120, 1120S, 1120-F, and 7004 Modernized e-File (MeF).” That
information can be found at
http://www.irs.gov/Tax-Professionals/e-File-Providers-&-Partners/2014-Valid-XML-Schemas-and-Business-Rules-for-1120-1120S-1120-F-and-7004-Modernized-e-File.

27. The name of the file that I examined is
ExampleTransmissionWithConsolidatedReturn.xml. A copy of the file I examined is available at
https://bulk.resource.org/irs.gov/eo/doc/doc/ExampleTransmissionWithConsolidatedReturn.xml .

28. In order to remove (redact) one element nested inside an XML file, I use a common 
programmer’s tool called a “text editor.” Any professional programmer has access to such
software. I use a text editor called bbedit on my Apple computer. Other examples of text editors 
are “vi” on any Unix computer, and “notepad” on any Windows computer. I used the bbedit 
software on the file named ExampleTransmissionWithConsolidatedReturn.xml, removed the 
element IRS1120LScheduleB, and saved the file with a new name. That entire process took me
57 seconds. 
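The same removal can also be done programmatically. The sketch below is illustrative only and is not the procedure described in the preceding paragraph: it uses Python's standard `xml.etree.ElementTree` module, the sample document is a made-up stand-in rather than the IRS example file, the namespace URI shown is an assumption, and the helper name `remove_elements` is hypothetical.

```python
import xml.etree.ElementTree as ET

# A tiny made-up stand-in for a return; the namespace URI is an assumption.
SAMPLE = (
    '<Return xmlns="http://www.irs.gov/efile"><ReturnData>'
    "<IRS1120L><Amt>5</Amt></IRS1120L>"
    "<IRS1120LScheduleB><Amt>1</Amt></IRS1120LScheduleB>"
    "</ReturnData></Return>"
)


def remove_elements(root: ET.Element, name: str) -> int:
    """Delete every element whose local (namespace-free) name equals `name`."""
    removed = 0
    for parent in list(root.iter()):
        for child in list(parent):
            # Tags look like "{namespace}LocalName"; compare the local part only.
            if child.tag.rsplit("}", 1)[-1] == name:
                parent.remove(child)
                removed += 1
    return removed


root = ET.fromstring(SAMPLE)
count = remove_elements(root, "IRS1120LScheduleB")
print(count, ET.tostring(root, encoding="unicode"))
```

Matching on the local name avoids having to spell out the namespace for each element, at the cost of ignoring which namespace an element belongs to.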

29. There are a number of techniques used to transform and process XML files. A 
common technique is the use of Style Sheets, a standard defined by the World Wide Web 
Consortium, the standards-making body for the World Wide Web. The definition of Extensible 
Stylesheet Transformations (XSLT) may be found at http://www.w3.org/TR/xslt . 

30. The IRS uses this technique to publish a number of sample files that can be used to 
transform returns in MeF. These style sheets can be used by businesses, tax preparers, and others 
to transform a return into another format, such as transforming the XML into HTML for display in 
a web browser. The IRS publishes these style sheets at
http://www.irs.gov/Tax-Professionals/e-File-Providers-&-Partners/Modernized-e-File-MeF-Stylesheets .

31. I wrote a very simple style sheet, a true and correct copy of which is attached as 
Exhibit A, that is based on something called an “identity transformation.” An identity 
transformation is a style sheet that copies everything that is input to the output with no changes.
An example of the identity transformation may be found in Section 7.5 of the XSLT specification.
I added a single line to the style sheet which copies every element except the element
IRS1120LScheduleB. It took me almost one hour to write this style sheet because it had been
several years since I looked at style sheets and I had to use Google to understand how to list the
namespaces that the IRS uses. Using a free open source program which comes on my computer 
called xsltproc, I was able to specify the name of an input file, the name of the style sheet, and the 
name of the output file. I ran that program and produced an XML file with the Schedule B 
removed. It took 1.429 seconds to execute this command on my desktop computer. I ran this 
program on a single instance of a Form 990, but this program could also be used, without 
modification, to process hundreds or thousands of instances of the Form 990. It can also be easily 
modified to remove multiple schedules. 
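The idea behind an identity transformation with an exclusion can be sketched outside of XSLT as well. The example below is not the style sheet attached as Exhibit A and does not use xsltproc; it is a minimal Python analogue, using the standard `xml.etree.ElementTree` module, that copies every element except those named in a skip list and applies the same copy to a batch of documents. The function names `identity_copy` and `redact_many` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def identity_copy(elem: ET.Element, skip: set[str]) -> ET.Element:
    """Copy `elem` recursively, omitting children whose local name is in `skip`."""
    out = ET.Element(elem.tag, dict(elem.attrib))
    out.text, out.tail = elem.text, elem.tail
    for child in elem:
        if child.tag.rsplit("}", 1)[-1] not in skip:
            out.append(identity_copy(child, skip))
    return out


def redact_many(documents: list[str], skip: set[str]) -> list[str]:
    """Apply the same copy-with-exclusions to each XML document in a batch."""
    return [
        ET.tostring(identity_copy(ET.fromstring(doc), skip), encoding="unicode")
        for doc in documents
    ]


docs = [
    "<Return><ScheduleA/><ScheduleB><Name>x</Name></ScheduleB></Return>",
    "<Return><ScheduleB/><ScheduleO>note</ScheduleO></Return>",
]
# Removing more than one schedule only requires a longer skip set.
for redacted in redact_many(docs, {"ScheduleB", "ScheduleO"}):
    print(redacted)
```

With an empty skip set the function reproduces its input, which is the sense in which it is an identity copy; the exclusions are the only change.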

32. Returns in MeF format are significantly easier to work with than the
bitmap files produced by the IRS and shipped on DVDs. For my particular application, finding 
Social Security Numbers in current returns, having the e-file data would have saved me a week of 
initial processing of the data and would have produced much more reliable results.

33. In addition to locating SSNs, the availability of the data in MeF format would 
unlock a large number of other applications. For example, in order to find returns in our collection 
of over 7.5 million Form 990s, computer programs must use a variety of search indices. With the 
data the IRS currently provides, we know the name of the nonprofit and rudimentary information 
such as the city, state, date of filing, and assets. If information were available in MeF format,
much more useful search capabilities would be possible using all of the data fields in the return to 
help the public readily access the information that they desire. 
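As an illustration of the kind of search capability described above, the sketch below builds a simple index from every populated field in a set of XML returns to the returns containing it. It is a minimal example under stated assumptions: the element names, the return identifiers, and the function name `index_returns` are hypothetical and are not drawn from the MeF schemas.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def index_returns(returns: dict[str, str]) -> dict[tuple[str, str], set[str]]:
    """Map (field name, lower-cased value) to the IDs of returns containing it."""
    index: dict[tuple[str, str], set[str]] = defaultdict(set)
    for return_id, xml_text in returns.items():
        for elem in ET.fromstring(xml_text).iter():
            value = (elem.text or "").strip()
            if value:
                field = elem.tag.rsplit("}", 1)[-1]  # drop any namespace prefix
                index[(field, value.lower())].add(return_id)
    return index


# Made-up returns with made-up field names.
returns = {
    "r1": "<Return><City>Sebastopol</City><State>CA</State></Return>",
    "r2": "<Return><City>Oakland</City><State>CA</State></Return>",
}
index = index_returns(returns)
print(sorted(index[("State", "ca")]))
```

Because every field in the structured data is indexed, a query is not limited to the handful of attributes, such as name, city, and state, that accompany the bitmap images.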

34. Public.Resource.Org’s request for Exempt Organization returns in MeF format 
instead of bitmap images would be of substantial use to perform audits for privacy violations of
Exempt Organization returns. If the MeF format data were available, I would be able to notify the 
IRS and other organizations with copies of this data more quickly about any breaches that were 
discovered. In addition to finding privacy breaches, there would be a large number of other 
beneficial applications in the public interest. It is my considered technical opinion, based on over 
30 years as a computer professional, extensive work with the XML standard, and 6 years of 
experience with the IRS Exempt Organizations database, that it is technically very easy to make
the MeF version of these public filings available. 

I declare under penalty of perjury under the laws of the United States that the foregoing is 
true and correct and that this declaration was executed this- ^7^ day of September, 2014 at 
Sebastopol, California. 

CARL MALAMUD 